Parallel graph reduction for divide-and-conquer applications, Part II - program performance
Abstract
An extensible machine architecture is devised to efficiently support a parallel reduction model of computation. The organisation of the machine is designed to match the behaviour of the application programs. A pilot implementation of the architecture is used to obtain an execution profile of each application. These profiles are used with a performance model to calculate optimal schedules of the applications. The resulting speedup figures give an upper bound for the performance gain that may be attained on a full implementation of the architecture. The most important result is that each application allows a processor utilisation of over 50% to be attained on our parallel architecture.
Key words: local memory architecture, multiple processor system, optimal scheduling, parallel graph reduction, performance measurement
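The relation between the speedup figures and the processor utilisation mentioned in the abstract can be sketched as follows. This is a minimal illustration that assumes the common definition of utilisation as speedup divided by the number of processors; the paper's own performance model may define it differently. The function names and the timings are hypothetical and are not taken from the paper.

```python
def speedup(sequential_time: float, parallel_time: float) -> float:
    """Ratio of sequential run time to parallel run time."""
    if sequential_time <= 0 or parallel_time <= 0:
        raise ValueError("run times must be positive")
    return sequential_time / parallel_time


def utilisation(sequential_time: float, parallel_time: float, processors: int) -> float:
    """Assumed definition: speedup divided by the processor count."""
    if processors < 1:
        raise ValueError("at least one processor is required")
    return speedup(sequential_time, parallel_time) / processors


if __name__ == "__main__":
    # Hypothetical timings, chosen only to illustrate the arithmetic.
    t_seq, t_par, n_proc = 120.0, 20.0, 8
    print(f"speedup     = {speedup(t_seq, t_par):.2f}")
    print(f"utilisation = {utilisation(t_seq, t_par, n_proc):.0%}")
```

Under this assumed definition, a utilisation above 50% means the speedup exceeds half the number of processors used.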
Similar resources
Parallel graph reduction for divide-and-conquer applications, Part I - program transformations
A proposal is made to base parallel evaluation of functional programs on graph reduction combined with a form of string reduction that avoids duplication of work. Pure graph reduction poses some rather difficult problems to implement on a parallel reduction machine, but with certain restrictions, parallel evaluation becomes feasible. The restrictions manifest themselves in the class of applicat...
Parallel Combinator Reduction: Some Performance Bounds
A parallel graph reduction machine simulator is described. This performs combinator reduction and can simulate various different parallel reduction strategies. A number of functional programs are examined, and experimental results presented comparing the amount of parallelism obtainable using explicit divide-and-conquer with the maximum amount of parallelism available in the programs. Keywords...
Kinematic Identification of Parallel Mechanisms by a Divide and Conquer Strategy
This paper presents a Divide and Conquer strategy to estimate the kinematic parameters of parallel symmetrical mechanisms. The Divide and Conquer kinematic identification is designed and performed independently for each leg of the mechanism. The estimation of the kinematic parameters is performed using the inverse calibration method. The identification poses are selected optimizing the observab...
N-Graphs: A Topology for Parallel Divide-and-Conquer on Transputer Networks
A parallel implementation of a divide-and-conquer template (skeleton) is derived systematically from its functional specification. The implementation makes use of a new processor topology for divide-and-conquer, called N-graph, which suits transputer networks well: there are not more than 4 links per processor, overlapping of computations and communication within a processor is exploited, the pr...
Programming, Tuning and Automatic Parallelization of Irregular Divide-and-Conquer Applications in DAMPVM/DAC
This paper presents a new object oriented framework DAMPVM/DAC which is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime. The processes are then mapped dynamically to processors taking into account their speeds and even loads by other user processes. The paper presents the programming interface (API) of the framework, ...
Publication date: 2009